Abstract

Evaluators of a statewide systemic school reform effort used Likert-type survey items to 
assess teachers’ satisfaction with the reform effort. But they also asked teachers to 
respond to an open-ended item on the conversations about teaching and learning they had 
engaged in during the previous 18 months. Though response rates were low, qualitative 
analysis revealed that the teachers had conversed about the same topics the school reform 
effort had promoted. Discussing evaluation findings led the leadership of the school 
reform effort to new understandings of the kind of evaluation data they needed to 
continue to monitor their efforts.

What Kind of Data are Needed to Evaluate a Statewide Systemic Education Reform?

This article reports on efforts to ascertain the effectiveness of an initiative, Re:Learning
New Mexico, in promoting systemic statewide change in P-12 schools. It arose out of a study
conducted by graduate students in school administration. This study provides answers to four
questions: First, what did the evaluation Team and the Re:Learning staff learn about
Re:Learning’s effectiveness from the data collected in spring 2000? Second, what did they learn
about the worth of that approach to evaluation? Third, how has Re:Learning changed since the
evaluation? Finally, what evaluation approaches may be indicated for the future?

Re:Learning New Mexico 

Nearly two-thirds of New Mexico’s public school students belong to ethnic minority groups, and just
under one-third live in poverty. On the percentage of students scoring at or above Proficient on the
components of the National Assessment of Educational Progress, New Mexico’s scores range
from six to 13 points below the national average, with a median difference of ten points
(Education Week, 2002, p. 71). The National Center for Public Policy and Higher Education
([NCPPHE], 2000) reported that New Mexico is well below the top states in the nation on high
school completion, K-12 course taking, and completion of a higher education program. As will
be apparent from the evaluation data below, Re:Learning schools are experiencing excessive
principal turnover.

Re:Learning, with a budget of approximately $4.50 per student in New Mexico public K-12
education, attempts to foster academic achievement and equity by providing staff
development at many levels of the school district, but particularly by working with teachers and
principals. Re:Learning has no ongoing appropriation from the state government; each year,
when the legislature meets, Re:Learning staff wait to hear whether they have been funded. Perhaps
because of its tenuous funding situation, the Re:Learning New Mexico leadership has struggled
for years with a dilemma: it believes it can be most effective by concentrating only on schools
and districts that are willing to commit seriously to a systemic reform effort, yet New Mexico
policymakers have urged it, since it receives public money, to refuse no request for assistance
from any public school. The Re:Learning leadership has worried that if it refuses such requests,
it might lose state support, but that if it diffuses its resources by helping schools
uncommitted to reform, it will be unable to show tangible results from its efforts and will lose
state funding anyway.

The Evaluation Team 

This study arose out of a program evaluation course at New Mexico State University.
The course project required students to design and implement an evaluation of a real-life client’s
program. In spring 2000, four students formed a group (the Team) to evaluate Re:Learning.
Although Re:Learning had existed for ten years and had had several external evaluators, its
leaders were open to the Team’s evaluation efforts. Furthermore, its need to justify its initiatives
to secure continued financing disposed Re:Learning’s leaders to embrace additional efforts to
assess its impact.

Theoretical Framework 

Evaluations to assess change in education often involve pre- and posttests to measure
achievement changes. However, statewide systemic change does not lend itself to this type of
investigation. Though reformers hope that change efforts will eventually translate into
improvements in school grades, test performance, student retention, and teacher satisfaction,
these variables are subject to too many influences for correlations with specific reform efforts to
be easily identified. Furthermore, reformers struggle to come up with indicators that “capture the
full range of goals that lie at the heart of all but the most limited school-based reform efforts”
(Shields & Knapp, 1997, p. 290).

Kennedy (1999) asserted that, “even when researchers seek to document influences on 
student learning, they are often unable to find adequate measures of the outcomes they seek” (p. 
345). Kennedy outlined four levels of approximations that may be indicative of change. These 
are, in decreasing order of credibility:

1. Classroom observations and standardized tests (Classroom observations are costly and
perhaps unreliable; standardized tests may place an overly narrow range of demands on 
students), 

2. Situated descriptions of teaching, i.e., teachers’ very specific descriptions of their own 
practices (These may be self-serving and difficult to interpret reliably), 

3. Non-situated testimony about practice, i.e., surveys that ask about teacher practices in 
general (These are best thought of as revealing teachers’ espoused principles of practice; 
they may not reveal much about teachers’ theories in action), and 

4. Testimony about effects of policies or programs (These have all the weaknesses of numbers 2
and 3, but more so).

Although the fourth level of approximation is the furthest removed
from the classroom, it is often the only recourse given time and financial constraints.

The information needed to support decision-making depends on how reformers define
their task, e.g., raising test scores, developing standards, or seeking other outcomes. Some
writers, however, have focused on the process of change, with Fullan insisting that teachers must
“converse about the meaning of change” (emphasis original, 2001, p. 124); Hargreaves
describing teachers brainstorming “ideas with their colleagues, ‘sparking off’ one another”
(1997, p. 12); Wolf, Borko, Elliott, and McIver emphasizing “teacher-to-teacher” conversations
(2000, p. 375); and Shields and Knapp writing about the “collaborative engagement of
stakeholders in decision making” (1997, p. 289). In light of these scholars’ views, documenting
the extent and content of teachers’ conversations becomes a key evaluation function.

Guba and Lincoln (1989) proposed evaluation that they called responsive, interpretive, 
and hermeneutic. They suggested that the evaluator “conduct the evaluation in such a way that 
each group must confront and deal with the constructions of all the others” (p. 41). They saw 
evaluation as a formative process that provided opportunities for new understandings to emerge. 
It is precisely these new understandings that Fullan (2001) argued were crucial for successful and 
sustained school reform. 

Patton (1997) argued for utilization-focused evaluation, in which, alongside developing
a credible evaluation design, evaluators devote at least equal attention to identifying the
“primary intended users” (Patton, p. 41) of the evaluation findings, involving them intensively in
the design, and fostering their interest in the findings. Thus the evaluator must “attend to
specific people who understand, value, and care about evaluation” (Patton, 1997, p. 50). Since
the class was using Patton’s text, Utilization-Focused Evaluation, the Team pursued the
utilization-focused approach.

Procedures 

The Team began by seeking the primary intended users. Two Team members met with the
Re:Learning Steering Committee, which consisted of advocates of school
reform from K-12 and postsecondary education institutions and representatives from business or
community groups and a foundation that had provided funding. After the meeting the Team
determined that the Steering Committee and the Re:Learning staff constituted the primary
intended users.

To ascertain what issues and concerns the primary users had, the Team sent them a letter
or an e-mail asking them to respond to the following statement: “I would really like to
know ________ about the program.” From their responses, the Team determined that the
topics the project stakeholders were most interested in were related to program effectiveness and
continuity of change beyond the initial “honeymoon” period. Primary intended users also
wanted to know how long the principal of each participating school had been at that school, since
they were concerned that high principal turnover might affect the continuity of change initiatives.

The previous year (1998-1999) an evaluation team for Re:Learning had surveyed
teachers with an eight-item survey that called for responses on a Likert-type scale (Table 1). 



Put Table 1 about here. 



For 1999-2000, the Team reused those eight items and added an open-ended question: 
“During the past eighteen months, what topics of conversation about teaching and learning have 
you engaged in?” 

The Team decided to send the survey to randomly selected schools that had been active
participants in the program within the last year. Out of a total of 175 schools, the Team selected
58 (2,100 teachers) using stratified random sampling (14 high schools, 12 middle schools, and
32 elementary schools). To enhance the return rate, the Team first phoned the
school principals to inform them that they would be receiving the survey and to ask them to
distribute the surveys to their faculty and return them in the stamped, pre-addressed envelope
provided. Team members also asked how long each principal had been in his or her present
position. Then the Team sent packets of survey forms to the principals of these schools with a
cover letter describing the purpose of the survey and asking for their cooperation. After four weeks,
the Team made reminder phone calls to those principals who had not returned the surveys.
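
The stratified draw described above can be sketched in a few lines of Python. This is only an illustration, not the Team's actual procedure: the text reports the total (175 schools) and the quotas (32 elementary, 12 middle, 14 high), but not how many schools each stratum contained, so the per-level frame sizes and the school labels below are hypothetical placeholders.

```python
import random


def stratified_sample(frame, quotas, seed=0):
    """Draw quotas[level] schools at random, without replacement, from each stratum."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    return {level: rng.sample(schools, quotas[level])
            for level, schools in frame.items()}


# Hypothetical sampling frame: only the total of 175 comes from the text;
# the split across levels and the school labels are invented for illustration.
frame = {
    "elementary": [f"E{i:03d}" for i in range(100)],
    "middle": [f"M{i:03d}" for i in range(35)],
    "high": [f"H{i:03d}" for i in range(40)],
}

# Quotas reported in the text (58 schools in all).
quotas = {"elementary": 32, "middle": 12, "high": 14}

sample = stratified_sample(frame, quotas)
print({level: len(chosen) for level, chosen in sample.items()})
```

Sampling within each level separately guarantees that every level is represented in the fixed proportions chosen, which a simple random draw of 58 schools would not.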

Findings 

Thirty-three schools responded to the survey with a total of 430 usable questionnaires
(20% of the teachers in the selected schools). Then the Team asked 18 principals attending a
meeting of Re:Learning’s “Principals’ Institute” to distribute surveys in their schools, thus
tapping another 760 teachers. Returns from this distribution increased the number of usable
surveys received by some 190 (25% of those teachers) from 13 more schools.
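
As a quick arithmetic check, the return rates quoted above can be recomputed from the counts in the text. The combined figure is an added illustration: it assumes the two groups of teachers do not overlap, which the text does not state, and it treats the approximate "some 190" as exactly 190.

```python
def response_rate(returned, distributed):
    """Share of distributed surveys that came back usable, in percent."""
    return 100 * returned / distributed


first_wave = response_rate(430, 2100)   # stratified sample of 58 schools
second_wave = response_rate(190, 760)   # Principals' Institute distribution
# Assumption: no teacher was counted in both distributions.
combined = response_rate(430 + 190, 2100 + 760)

print(f"{first_wave:.1f}% {second_wave:.1f}% {combined:.1f}%")
```

The first two values come to about 20.5% and 25%, consistent with the rounded figures reported; the combined rate under the stated assumption is about 21.7%.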

Findings from the Likert-type Items 

For the Likert-type items, the Team calculated percentages of teachers selecting each 
response. Percentages of positive responses are shown in Tables 2 and 3.



Put Tables 2 and 3 about here. 



The most dramatic finding was that elementary teachers appeared the most satisfied with
Re:Learning, and that satisfaction decreased through the middle and high school levels. In fact,
the median percent of positive responses reported (across the eight items) in Tables 2 and 3 was
81.5% at the elementary level, 75% at the middle school level, and 61% at the high school level.
This suggests that Re:Learning has been most successful at the elementary level and may need
to reconsider its approaches at the high school level. Or it may simply be that secondary schools,
with their larger size and traditional commitment to subject specialization, are slower to respond
to change initiatives.

The second finding was that overall, the Principals’ Institute schools did not respond
substantially more positively than the non-Principals’ Institute schools. This finding was of
interest because the Team and the Steering Committee had assumed that the Principals’ Institute
schools had made a greater commitment to Re:Learning and that the result would be greater
effectiveness and perhaps greater satisfaction. The median percent reported (across the eight
items) in Tables 2 and 3 was 78% for the non-Principals’ Institute schools and only 77.5% for
the Principals’ Institute schools. The survey data did not provide strong support for the notion
that the level of commitment required of Principals’ Institute schools contributed to more
positive responses.

Findings from the Open-ended Question 

The Team then met to read responses to the open-ended question, discuss themes that
emerged, and categorize responses according to those themes (Guba & Lincoln, 1989). A total of
30 separate themes were coded (Seidman, 1997), and these were listed in terms of frequency in
line with Lee’s advice that even with qualitative material it is sometimes useful to “count the
countable” (1999, p. 121).

Four hundred thirty-two teachers at 46 campuses responded to the open-ended question
(question 12): 243 teachers at 25 elementary schools, 86 teachers at 10 middle schools, and
103 teachers at 11 high schools. Table 4 shows the topics most often mentioned by teachers at
the elementary, middle, and high school levels.



Put Table 4 about here.



Five of the top ten topics at each school level were common to all three levels. These
were alternative assessment, standards/benchmarks/curriculum mapping, teaching methods,
learning and teaching styles/brain research, and state-mandated tests/accountability. Were these
topics a result of Re:Learning’s professional development, or were they simply topics uppermost
in the minds of teachers generally? All these topics had been the subjects of workshops,
seminars, and sessions designed by Re:Learning. Hence it is reasonable to consider that
Re:Learning activities “contributed in concrete ways” (Patton, 1997, p. 217) to the popularity of
these topics of teachers’ conversations.
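
The overlap among the three top-ten lists can be checked mechanically. The sketch below simply transcribes the topic labels from Table 4 and intersects the three sets; it illustrates the comparison and is not a record of how the Team tallied the topics.

```python
# Top-ten topics per school level, transcribed from Table 4.
top_ten = {
    "elementary": {
        "Literacy/vocabulary building/4-block reading",
        "Alternative assessment",
        "Standards/benchmarks/curriculum mapping",
        "Teaching methods",
        "Learning and teaching styles/brain research",
        "State-mandated tests/accountability",
        "Technology",
        "Classroom management",
        "Enablers and disenablers in the organization",
        "Special education",
    },
    "middle": {
        "Standards/benchmarks/curriculum mapping",
        "Alternative assessment",
        "Teaching methods",
        "Socratic seminar",
        "Technology",
        "Thematic unit planning",
        "Classroom management",
        "Learning and teaching styles/brain research",
        "State-mandated tests/accountability",
        "Collegial coaching/critical friends group",
    },
    "high": {
        "Standards/benchmarks/curriculum mapping",
        "Four by four block scheduling for secondary schools",
        "Literacy/vocabulary building/4-block reading",
        "Student motivation, lack of",
        "Advanced placement/gifted",
        "Alternative assessment",
        "Learning and teaching styles/brain research",
        "Teaching methods",
        "Thematic unit planning",
        "State-mandated tests/accountability",
    },
}

# Topics that appear in the top ten at every level.
common = set.intersection(*top_ten.values())
for topic in sorted(common):
    print(topic)
```

The intersection contains the five topics named in the paragraph above; topics such as technology and classroom management drop out because they do not reach the high school top ten.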

For the schools that responded to the survey, the Team divided the total number of items
mentioned by all teachers at each level by the number of teachers employed at that level. This
was done to control for the size of the schools (presumably larger populations can have more
conversations). Table 5 shows a “fair” comparison among elementary, middle, and high schools
of the “total amount of talk about teaching and learning.” The number of items mentioned per
teacher employed decreased from elementary to middle and then to high school. This
corroborates the findings from the analyses of the Likert-type items, where teacher beliefs about
the impact of Re:Learning on each of the three areas decreased across the elementary, middle,
and high school continuum.
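
The size adjustment is a simple division of items listed by teachers employed. A minimal sketch using the counts reported in Table 5 follows; the function name is ours, chosen for illustration.

```python
def items_per_teacher(teachers_employed, items_listed):
    """Ratio of Column 2 (items listed) to Column 1 (teachers employed)."""
    return items_listed / teachers_employed


# (teachers employed, items listed) per level, as reported in Table 5.
table5 = {
    "Elementary": (727, 733),
    "Middle": (396, 292),
    "High": (520, 233),
}

for level, (teachers, items) in table5.items():
    print(f"{level:<12}{items_per_teacher(teachers, items):.2f}")
```

The same division applied to the counts in Table 6 yields the ratios for the Principals’ Institute comparison.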



Put Table 5 about here. 



Table 6 shows a “fair” comparison of Principals’ Institute to non-Principals’ Institute
schools. Overall, the Principals’ Institute schools did not report as many conversations about
teaching and learning as non-Principals’ Institute schools.



Put Table 6 about here.



Finally, we found that Re:Learning leaders’ concerns about principal mobility were well
founded. The mean principal tenure at the 46 schools was only 3.6 years (elementary: 4.8 years,
middle school: 2.4 years, and high school: 2.1 years).
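
The overall mean can be related to the level means by weighting each by its number of schools. The sketch below assumes the 46 schools break down as 25 elementary, 10 middle, and 11 high schools, the split reported for the campuses that answered the open-ended question; the text does not say explicitly that the tenure figures use the same split.

```python
# (number of schools, mean principal tenure in years) per level.
# The school counts are an assumption taken from the open-ended response breakdown.
tenure_by_level = {
    "elementary": (25, 4.8),
    "middle": (10, 2.4),
    "high": (11, 2.1),
}

total_schools = sum(n for n, _ in tenure_by_level.values())
weighted_mean = sum(n * years for n, years in tenure_by_level.values()) / total_schools

print(total_schools, round(weighted_mean, 1))
```

Under that assumption the weighted mean rounds to the 3.6 years reported, which suggests the overall figure is dominated by the longer tenures at the more numerous elementary schools.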

Discussion 

At the outset, we listed four questions for this study. We answer them here.

What was Learned about Re:Learning’s Effectiveness?

Anecdotal evidence had suggested that Re:Learning was
successful. In previous years, external evaluators had asserted, “professional development
received rave reviews, workshop after workshop,” “staff have done an excellent job of
structuring projects so that schools grow and improve,” and “the initiative is grounded in
research, is supported by key stakeholders, and has evolved through the process of change
without losing the focus on impacting students, educators, and communities.” The Team,
assuming that teachers are the best judges of Re:Learning’s effectiveness, concentrated on
ascertaining teachers’ attitudes toward Re:Learning. Our findings suggest that there are high
levels of support for and satisfaction with Re:Learning. Responses to the open-ended question
suggest that at least some teachers are conversing about the topics Re:Learning emphasizes. The
findings from our analysis of the qualitative data suggest that teachers’ conversations about topics
initiated by Re:Learning were extensive and varied. Twenty-nine topics were discussed, and in
many schools, teacher conversations coalesced around topics covered by the project staff. This
suggests that initiatives developed by the project have had a lasting impact on teachers’
conversations and consciousness.

The relatively lower positive findings from the secondary schools and the unimpressive
findings from the Principals’ Institute schools are cause for concern and give Re:Learning
information about foci for fine-tuning its efforts.

What was Learned about the Worth of this Approach to Evaluation?

The combination of quantitative and qualitative data gathered from the survey provided
the Team and Re:Learning two approximations from which to assess teachers’ perceptions of
and involvement in Re:Learning. The evaluation findings, we believe, add validity to the notion
that teachers’ self-report of conversations can help assess the impact of a statewide systemic
change initiative such as Re:Learning.

Kennedy (1999) asserted, “no one has attempted to measure the relationship between
fourth-level testimonials about policy impact and any closer levels of approximation.” Given the
cost, time, and complexity involved in first-level strategies (i.e., classroom observations and
standardized tests), teacher testimonials might be the best affordable alternative. Question 12 of
the survey, by providing an opportunity for teachers to report those topics of discussion they had
engaged in during the previous 18 months, evoked a greater than expected response. This
suggests that the quantity and variety of conversations about topics addressed by change
agencies might provide another level of evaluation. At the same time, Re:Learning leaders and
evaluators must face the fact that this survey had a low return rate, despite calls to the campuses
to urge cooperation with it. The unknown representativeness of the findings is a serious blow to
the evaluation’s credibility. Further research could investigate whether, and to what extent,
teachers’ self-reports of conversations correlate with findings generated by traditional, and
sometimes more expensive, evaluative methods. To measure success with more confidence,
Re:Learning leaders should consider allocating funds to acquiring at least some data from
Kennedy’s (1999) other levels of approximation of change.

How has the Program Changed Since the Evaluation?

Re:Learning has decided to offer a three-tiered approach to providing professional
development to schools. With limited resources and many calls for assistance, Re:Learning has
decided that it can both be more effective and more validly assess its effectiveness by
adjusting the level of its support to the degree of formal commitment a school and district will
make to participate in Re:Learning. For the lowest level of commitment, Re:Learning will
offer only regional workshops. For the highest level, it will work intensively with school and
district staff. For the middle level, it will offer support between the lowest and highest levels.
Simultaneously, Re:Learning staff report that state policymakers have backed off their earlier
position that Re:Learning must not refuse support to any school that asks. In fact, in 2001-2002,
interested schools agreed to commit to either the middle or highest level. None expressed
interest in the lowest level of participation.

What Evaluation Approaches may be Indicated for the Future? 

This study helped the Steering Committee clarify its specific goals in promoting systemic
change, and extended the hermeneutics to include one of Re:Learning’s focus populations (i.e.,
the teachers). As a result of the Team’s evaluation, the Steering Committee became more
focused regarding which evaluation questions it wanted answered. The evaluation also
supported Re:Learning’s concerns about principal turnover.

Does rapid turnover of principals lead to discontinuities in school reform efforts?
Certainly, some research suggests so (Muncey & McQuillan, cited in Wolf et al., 2000). In that
light, what chance do the schools in our study have to sustain such efforts? In any reform
situation, it is crucial to have organization member buy-in, especially when the organizational
leadership is unstable. This makes it particularly important to find a way to monitor teacher
perceptions of the reform. Some have suggested that leadership roles be diffused throughout the
school, an arrangement Wolf et al. called “cooperative leadership” (p. 366). This concept prompted a
discussion among the Steering Committee about what a school would look like before, and what
it might look like after, such a change. One member recommended collecting data on staff
retention rates as an indicator of effective change. Surveying teachers to discover whether they
saw any change in principals’ leadership and evaluation of staff was also suggested. The focus of
school staff meetings, conversations generated at these, and particularly discussions regarding
students’ work were also discussed as possible areas of further investigation.

Conclusion 

The effectiveness of statewide initiatives for school change is complex and difficult to
assess. To get a fuller picture of the breadth and depth of change initiatives, one must use
methods other than the quantitative. Change is evident not only in measurable achievement
outcomes, but also in the quality of relationships within the school itself. Conversations
generated by teachers around the foci of change may be signs of deep processing of the
substance of change efforts. As school leaders begin to realize this, further research will be
needed into the nature and extent of conversations around change efforts. This is particularly
important given that policymakers “want evidence of better goal setting, rational program
choices to attain those goals, and documentation of results” (Wang et al., 1998, p. 66).

Table 1

Survey used for evaluation of the Program in 1999-2000 and in 2000-2001

For the [Program] professional development in which you participated in the last eighteen
months, to what extent have those activities

                                        Not at all    To a small    To some       To a great
                                                      extent        extent        extent

1. Contributed to the improvement       1             2             3             4
   of your instruction?

2. Helped you to implement curriculum   1             2             3             4
   and performance standards?

3. Promoted collaboration with other    1             2             3             4
   teachers?

4. Been appropriate for the grade       1             2             3             4
   level(s) and subject(s) you teach?

How often is [the Program’s] professional development for teachers at this school

                                        Never       Rarely      Sometimes   Frequently  Always

5. Designed or chosen to support the    1           2           3           4           5
   goals of the school’s
   [state-required improvement plan]?

6. Designed or chosen to support the    1           2           3           4           5
   goals of the district’s
   [state-required improvement plan]?

7. Planned by teachers at this school?

8. Have you participated in any [Program] professional development activities that focus on
   student assessment (e.g., methods of testing, evaluation, performance assessment(s), or
   rubric(s))?

   Yes          No

   Then what was the impact of the activities?

   Not at all useful    1    2    3    4    5    Very useful

Table 2

Percent of teachers responding with “to some extent” or “to a great extent” to the question, “For the [Program] professional
development in which you participated in the last eighteen months, to what extent have those activities:”

Table 4

Ten Most-Often-Mentioned Topics about Teaching and Learning

Topic                                                   Number of teachers
                                                        that mentioned it

Elementary Schools
  Literacy/vocabulary building/4-block reading          129
  Alternative assessment                                68
  Standards/benchmarks/curriculum mapping               60
  Teaching methods                                      60
  Learning and teaching styles/brain research           45
  State-mandated tests/accountability                   44
  Technology                                            39
  Classroom management                                  36
  Enablers and disenablers in the organization          29
  Special education                                     27

Middle Schools
  Standards/benchmarks/curriculum mapping               35
  Alternative assessment                                27
  Teaching methods                                      22
  Socratic seminar                                      22
  Technology                                            21
  Thematic unit planning                                18
  Classroom management                                  16
  Learning and teaching styles/brain research           14
  State-mandated tests/accountability                   13
  Collegial coaching/critical friends group             13

High Schools
  Standards/benchmarks/curriculum mapping               32
  Four by four block scheduling for secondary schools   24
  Literacy/vocabulary building/4-block reading          16
  Student motivation, lack of                           14
  Advanced placement/gifted                             13
  Alternative assessment                                12
  Learning and teaching styles/brain research           12
  Teaching methods                                      11
  Thematic unit planning                                10
  State-mandated tests/accountability                   9

Table 5

Comparison of amount of talk about teaching and learning reported by elementary, middle, and
high schools, controlling for size of the populations

              1. Number of          2. Number of          Ratio of
              teachers employed     items listed          Column 2 to Column 1

Elementary    727                   733                   1.0
Middle        396                   292                   0.74
High          520                   233                   0.45

Table 6

Comparison of amount of talk about teaching and learning reported by non-Principals’ Institute
and Principals’ Institute schools, controlling for size of the populations

                                     1. Number of          2. Number of          Ratio of
                                     teachers employed     items listed          Column 2 to Column 1

Non-Principals’ Institute Schools    1088                  910                   0.84
Principals’ Institute Schools        558                   349                   0.63